Build integrated Python package library #3144

tpboudreau · 2020-06-03T17:14:57Z

PLEASE DO NOT MERGE WITHOUT MODIFICATION -- SEE BELOW.

This PR introduces a new build option that builds lib_lightgbm.{so,dll} with the GPU code enabled, but without any runtime dependency on an OpenCL ICD loader or Boost libraries. It does this by integrating the Khronos ICD Loader and Boost filesystem and system libraries into the LightGMB shared library at build time (akin to static linking). This build strategy is useful for building the library for inclusion in a Python wheel targeted for Windows, as it allows the package to run on both GPU and non-GPU (OpenCL) equipped machines.

There are at least two issues that must be resolved before merging:

the static build technique also works on Linux, but will require a small upstream change in the build options for the Khronos ICD library. For now, I'm referencing my fork with that change in the Khronos build script. Depending on community feedback (that is, whether this build option would be useful on Linux too) I'll either pursue the upstream change in the Khronos library or pull out the Linux build code from this PR.
Ideally, when building a dll for inclusion in a Windows wheel, the library name would contain a unique stamp or hash of some kind as a failsafe to avoid conflicts with other copies of the library that might be present on the runtime system (see for example this discussion: https://discuss.python.org/t/packaging-dlls-on-windows/1401). The current dll build options don't do this. I've included logic in the new build path to generate such a stamp, but I've commented out the lines of the script that rename the final library until I hear back from the community on whether this approach is desirable. (It would require some changes to setup.py and perhaps other files, and maybe it belongs in a different PR.)

To run this build you can do: "python setup.py install --opencl-python-package".

Alternatively, to build directly from a git-bash command line on Windows run:

mkdir build && cd build
cmake .. -D__OPENCL_PYTHON_PACKAGE=ON -Wno-dev
cmake --build . --target _lightgbm --config "Release"

You can also build with the equivalent MSVC GUI actions.

On Linux, build with:

mkdir build && cd build
cmake -D__OPENCL_PYTHON_PACKAGE=ON -Wno-dev -DCMAKE_BUILD_TYPE=Release ..
make

(As mentioned above, the Linux build won't run to completion until we resolve the Khronos upstream change issue.) [EDIT: the Linux build should work now, but the upstream change issue still must be resolved before merging.]

ghost · 2020-06-03T17:15:14Z

All CLA requirements met.

jameslamb · 2020-06-08T17:49:23Z

CMakeOpenCLPythonPackageCache.txt

+set(USE_TIMETAG OFF CACHE BOOL "")
+set(USE_DEBUG OFF CACHE BOOL "")
+set(BUILD_FOR_R OFF CACHE BOOL "")
+


I don't have other comments to offer yet, but can you please be sure that CMake cache files are not checked into the repo?

Maybe add a *Cache.txt entry to .gitignore, I don't think that would conflict with anything we have intentionally checked in.

Thanks for the early feedback.

The file was meant as a cmake preload file, but since it's likely to cause further confusion, I've moved its contents into the main CMakeLists.txt file and removed the offending file.

tpboudreau · 2020-06-09T19:19:01Z

I don't believe the failing test (doc build) is related to my patch.

tpboudreau · 2020-07-01T15:26:52Z

Just checking in to see if there are any comments on this PR. I don't believe the test failures are related to my changes (two are R-package issues and one seems like a transient download error). Thanks.

guolinke · 2020-07-07T10:19:20Z

@StrikerRUS can you help to review this PR?

StrikerRUS · 2020-07-07T13:04:39Z

@guolinke

can you help to review this PR?

Sorry, I'm not sure that I'm able to provide a thoughtful review for this PR as I'm not familiar enough with OpenCl. I believe that @huanzhang12 should be the main reviewer for this PR. Of course, after his review I'll review the Python part.

itamarst · 2020-07-28T15:32:07Z

@huanzhang12 any chance you could review this? Thank you!

StrikerRUS

@tpboudreau Thanks a lot for your amazing work! Please take a look at one more round of review when have time.

CMakeIntegratedOpenCL.cmake

StrikerRUS · 2020-09-13T00:41:18Z

CMakeIntegratedOpenCL.cmake

+endif()
+list(APPEND INTEGRATED_OPENCL_INCLUDES ${OPENCL_ICD_LOADER_HEADERS_DIR})
+list(APPEND INTEGRATED_OPENCL_LIBRARIES ${opencl-icd-loader_BINARY_DIR}/Release/OpenCL.lib cfgmgr32.lib runtimeobject.lib)
+list(APPEND INTEGRATED_OPENCL_DEFINITIONS CL_TARGET_OPENCL_VERSION=120)


Could you please comment on this from compatibility side? Why 1.2?

This appears to be the ~~latest~~ earliest stable version that contains all the required api's (and 2.x seems to have been a long experiment: https://www.extremetech.com/computing/309842-opencl-3-0-kicks-off-with-a-huge-step-backwards).

But the library compiles without specifying the version (with a few extra messages), so we can remove this if you prefer.

Thanks for the clarification! I'm afraid I have too little knowledge about OpenCL to take a responsibility for this change. I'd better leave it for your decision or more experienced in OpenCL maintainer.

I just want to make it possible that as many as possible users will be able to simply download our wheel and run LightGBM with GPU support.

Yes, including the 1.2 guard when you compile the headers prevents you from inadvertently including a call to a 2.x function, which would limit the library to newer implementations. OpenCL versions are backward compatible, so a 2.x implementation can run a 1.2 application/library. (See item 3 here for example: http://developer.amd.com/wordpress/media/2013/12/AMD_APP_SDK_FAQ1.pdf)

Excellent! Thanks again!

CMakeIntegratedOpenCL.cmake

CMakeLists.txt

StrikerRUS · 2020-09-13T05:39:20Z

CMakeIntegratedOpenCL.cmake

+set(BOOST_BOOTSTRAP "${BOOST_BASE}/source/bootstrap.bat")
+set(BOOST_BUILD "${BOOST_BASE}/source/b2.exe")
+set(BOOST_FLAGS "")
+list(APPEND BOOST_SUBMODULES "libs/*" "tools/*")


I guess we can workaround 1.5Gb+ size problem by specifying only required submodules here.

IIRC the issue is that all boost libraries were being cloned even though only a handful were needed and I couldn't figure out a way to get around this.
#3144 (comment)

Starting from chrono, filesystem, headers, system, tools/* by trials and errors I came up with the following list which allows to compile successfully, but I bet you know better which submodules are needed. Also, I believe we can reduce number of submodules under the tools folder.

Suggested change

list(APPEND BOOST_SUBMODULES "libs/*" "tools/*")

list(APPEND BOOST_SUBMODULES "libs/algorithm" "libs/align" "libs/any" "libs/array" "libs/assert" "libs/bind" "libs/chrono" "libs/compute" "libs/concept_check" "libs/config" "libs/container" "libs/container_hash" "libs/core" "libs/detail" "libs/filesystem" "libs/foreach" "libs/format" "libs/function" "libs/function_types" "libs/fusion" "libs/headers" "libs/integer" "libs/io" "libs/iterator" "libs/lexical_cast" "libs/math" "libs/move" "libs/mpl" "libs/multi_index" "libs/numeric/conversion" "libs/optional" "libs/predef" "libs/preprocessor" "libs/property_tree" "libs/range" "libs/ratio" "libs/serialization" "libs/smart_ptr" "libs/static_assert" "libs/system" "libs/throw_exception" "libs/tuple" "libs/typeof" "libs/type_index" "libs/type_traits" "libs/utility" "libs/uuid" "libs/winapi" "tools/boost_install" "tools/build")

I know, this list is looking super scary, but explicitly listing submodules it is possible to decrease build time and downloading size significantly (1.5Gb+ -> 700Mb)

Also, I believe we can reduce number of submodules under the tools folder.

OK, I've experimented more and it looks like ~~we do not need submodules from tools/ at all~~. I'm updating my yesterday original comment with new suggestion.

UPD: Was wrong. tools/boost_install and tools/build are needed.

@StrikerRUS , thank you for doing this exploration.

I'm not opposed to this change, and I've committed the suggested patch, but I think there are some considerations the maintainers should be aware of.

One of the challenges in this patch was coaxing three different build systems (Boost's B2, cmake, and MSVC/make) to cooperate. And as you know I was originally targeting more than just the Windows platform. I found that the Boost dependency list generated at build time differed significantly by platform -- on Linux (generally) only the few expected dependent submodules were built, on Windows (generally), many more submodules were fetched. This was a bit surprising, but I didn't spend a lot of time chasing down why, because it seemed to me that downloading all submodules was always safe, and that even if I figured it all out for each platform, it would lead to complicated platform checking code that might turn out to be fragile in the long run (meaning a risk of not working anymore if the Boost libs or any of the three build systems changed or started interacting differently).

Of course I realize that there may be costs associated with fetching and processing code unecessarily in the CI pipeline, and also that non-Windows builds may never be needed, so I'm OK with the suggested change. I just wanted to alert you all that you may have to rethink this targeted download approach if other platforms are added or the build becomes unstable.

OK, I understand! Please correct me: as long as we fix a particular commit in the boost repository and building the library only for Windows users, it is safe to go with my list. In future we can update this list to support building on Linux machines or add conditioning if(WIN32) ... else() ... or fall back to downloading all submodules for all OSes.

Despite that we are using only free CI services (BTW, it will be not true anymore after introducing CUDA support #3160), we build nightly artifacts for each commit in master and in general I believe it is good practice to download things only that are needed.

Not related to this PR, but just my curiosity. Don't you know were there any changes in macOS related to OpenCL support? Refer to #1523 (comment) and #667. I believe some macOS users with eGPUs will benefit a lot.

I'm sorry, but I'm not too familiar with the details of OpenCL on MacOS. But the #667 issue you linked to seems (at first glance) like it could be resolved by an approach similar to this PR. I'll look into it further when I get a bit of time.

Co-authored-by: Nikita Titov <[email protected]>

Targeted download of Boost submodules. Co-authored-by: Nikita Titov <[email protected]>

StrikerRUS

@tpboudreau Excellent contribution! I hope it will greatly help to make GPU installation easier for users. Just one super minor comment below.

CMakeLists.txt

Co-authored-by: Nikita Titov <[email protected]>

tpboudreau · 2020-09-17T21:35:13Z

Looks green after the latest sync to master.

Thanks @StrikerRUS and @jameslamb for your attention and contributions to this PR.

StrikerRUS · 2020-09-21T14:55:13Z

@tpboudreau Could you please resolve two conflicts and I think we can merge this PR.

tpboudreau · 2020-09-21T17:29:43Z

I think this is ready for merging. I believe the failing test is a resource availability issue unrelated to this patch -- please let me know if you think otherwise.

StrikerRUS · 2021-07-09T11:43:06Z

Hey @tpboudreau @itamarst ! Is there any news about integration of OpenCL on non-Windows platforms?

github-actions · 2023-08-23T18:57:12Z

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

tpboudreau added 3 commits June 1, 2020 14:41

Add specialized OpenCL/Python package build path

b1841bd

Refer to upstream OpenCL repository

27dffcb

Reset build job count in setup.py

2958084

tpboudreau requested review from btrotta, chivee, guolinke, henry0312, huanzhang12, jameslamb, Laurae2, StrikerRUS and wxchan as code owners June 3, 2020 17:14

TEMPORARY: refer to OpenCL fork to ensure Linux CI builds succeed

519a5b9

itamarst mentioned this pull request Jun 3, 2020

Windows GPU wheels #3143

Closed

7 tasks

tpboudreau added 2 commits June 3, 2020 19:45

Remove intermediate cmake target

84b6804

Restrict OpenCL headers to documented API version

ec4b3f4

jameslamb reviewed Jun 8, 2020

View reviewed changes

tpboudreau added 3 commits June 8, 2020 20:06

Use command line definition to activate integrated build

1fc7878

Merge branch 'master' into build_python_package_library

8e66b4e

Flag reference to unofficial repo with FIXME

af176a6

tpboudreau changed the title ~~[WIP] Build integrated Python package library~~ Build integrated Python package library Jun 9, 2020

a-wozniakowski mentioned this pull request Jun 18, 2020

[python][scikit-learn] new stacking tests and make number of features a property #3173

Merged

jameslamb added the feature label Jun 23, 2020

Merge branch 'master' into build_python_package_library

e0c71cb

StrikerRUS requested changes Sep 13, 2020

View reviewed changes

StrikerRUS mentioned this pull request Sep 14, 2020

[R-package] mark CMake option for building R package as private #3397

Merged

tpboudreau and others added 10 commits September 14, 2020 11:52

Small cleanups

d5fad62

Update CMakeIntegratedOpenCL.cmake

dc5ef75

Co-authored-by: Nikita Titov <[email protected]>

Update CMakeIntegratedOpenCL.cmake

4fd05bc

Co-authored-by: Nikita Titov <[email protected]>

Update CMakeIntegratedOpenCL.cmake

8a23901

Co-authored-by: Nikita Titov <[email protected]>

Update CMakeIntegratedOpenCL.cmake

8f140dc

Co-authored-by: Nikita Titov <[email protected]>

Update CMakeLists.txt

f654d91

Co-authored-by: Nikita Titov <[email protected]>

Update CMakeLists.txt

2597549

Co-authored-by: Nikita Titov <[email protected]>

Update CMakeLists.txt

db60d59

Co-authored-by: Nikita Titov <[email protected]>

Update CMakeIntegratedOpenCL.cmake

e274c8e

Targeted download of Boost submodules. Co-authored-by: Nikita Titov <[email protected]>

Merge from master, with conflict resolution

e4d2e3a

StrikerRUS approved these changes Sep 15, 2020

View reviewed changes

CMakeLists.txt Outdated Show resolved Hide resolved

tpboudreau and others added 2 commits September 17, 2020 13:30

Update CMakeLists.txt

6acfd26

Co-authored-by: Nikita Titov <[email protected]>

Merge branch 'master' into build_python_package_library

ee9f545

Merge from master, with conflict resolution

5d85aa1

StrikerRUS merged commit 3454698 into microsoft:master Sep 21, 2020

itamarst mentioned this pull request Sep 22, 2020

[python] Windows wheels that support GPUs via OpenCL #3403

Merged

StrikerRUS mentioned this pull request Nov 9, 2020

[R-package] [ci] Fix failures with R 3.6 and CMake (fixes #3469) #3541

Merged

jameslamb mentioned this pull request Jan 28, 2021

[R-package] Add an R GPU job in CI #3780

Closed

This was referenced Oct 8, 2021

[Question] LightGBM GPU version for setuptools #4653

Closed

Build Python wheels that support both GPU and CPU versions out of the box for non-Windows #4684

Closed

jgiannuzzi mentioned this pull request May 30, 2022

Build integrated OpenCL Linux wheels #5252

Merged

jameslamb mentioned this pull request Jun 20, 2023

LightGBMError: GPU Tree Learner was not enabled in this build. Even when LightGBM was compiled successfully with CMake option -DUSE_GPU=1 #5928

Closed

github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Build integrated Python package library #3144

Build integrated Python package library #3144

tpboudreau commented Jun 3, 2020 •

edited

Loading

ghost commented Jun 3, 2020 •

edited by ghost

Loading

jameslamb Jun 8, 2020

tpboudreau Jun 8, 2020

tpboudreau commented Jun 9, 2020

tpboudreau commented Jul 1, 2020

guolinke commented Jul 7, 2020

StrikerRUS commented Jul 7, 2020

itamarst commented Jul 28, 2020

StrikerRUS left a comment

StrikerRUS Sep 13, 2020

tpboudreau Sep 14, 2020 •

edited

Loading

StrikerRUS Sep 15, 2020

tpboudreau Sep 17, 2020

StrikerRUS Sep 18, 2020

StrikerRUS Sep 13, 2020 •

edited

Loading

StrikerRUS Sep 13, 2020 •

edited

Loading

tpboudreau Sep 14, 2020 •

edited

Loading

StrikerRUS Sep 15, 2020

StrikerRUS Sep 15, 2020

tpboudreau Sep 17, 2020

StrikerRUS left a comment

tpboudreau commented Sep 17, 2020

StrikerRUS commented Sep 21, 2020

tpboudreau commented Sep 21, 2020

StrikerRUS commented Jul 9, 2021

github-actions bot commented Aug 23, 2023

Build integrated Python package library #3144

Build integrated Python package library #3144

Conversation

tpboudreau commented Jun 3, 2020 • edited Loading

ghost commented Jun 3, 2020 • edited by ghost Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tpboudreau commented Jun 9, 2020

tpboudreau commented Jul 1, 2020

guolinke commented Jul 7, 2020

StrikerRUS commented Jul 7, 2020

itamarst commented Jul 28, 2020

StrikerRUS left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tpboudreau Sep 14, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

StrikerRUS Sep 13, 2020 • edited Loading

Choose a reason for hiding this comment

StrikerRUS Sep 13, 2020 • edited Loading

Choose a reason for hiding this comment

tpboudreau Sep 14, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

StrikerRUS left a comment

Choose a reason for hiding this comment

tpboudreau commented Sep 17, 2020

StrikerRUS commented Sep 21, 2020

tpboudreau commented Sep 21, 2020

StrikerRUS commented Jul 9, 2021

github-actions bot commented Aug 23, 2023

tpboudreau commented Jun 3, 2020 •

edited

Loading

ghost commented Jun 3, 2020 •

edited by ghost

Loading

tpboudreau Sep 14, 2020 •

edited

Loading

StrikerRUS Sep 13, 2020 •

edited

Loading

StrikerRUS Sep 13, 2020 •

edited

Loading

tpboudreau Sep 14, 2020 •

edited

Loading